Every Layer Counts: Multi-Layer Multi-Head Attention for Neural Machine Translation
Authors
Abstract
Similar Papers
Paying Attention to Multi-Word Expressions in Neural Machine Translation
Processing of multi-word expressions (MWEs) is a known problem for any natural language processing task, and even neural machine translation (NMT) struggles to overcome it. This paper presents experimental results on how NMT attention is allocated to MWEs and on improving the automated translation of sentences that contain MWEs in English→Latvian and English→Czech NMT systems. Two improvemen...
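As a rough illustration of the attention-allocation analysis described above, the sketch below measures how much cross-attention mass a decoder places on the source tokens of an MWE. The function name, the attention-matrix layout, and the toy numbers are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def mwe_attention_mass(attention, mwe_positions):
    """Fraction of cross-attention mass that falls on MWE source tokens.

    attention: array of shape (target_len, source_len), each row sums to 1
    mwe_positions: indices of the source tokens that belong to the MWE
    """
    attention = np.asarray(attention)
    mass_on_mwe = attention[:, mwe_positions].sum(axis=1)  # per target token
    return float(mass_on_mwe.mean())

# Toy example: 3 target tokens attending over 4 source tokens,
# where source tokens 1 and 2 form a multi-word expression.
attn = np.array([
    [0.70, 0.10, 0.10, 0.10],
    [0.05, 0.45, 0.45, 0.05],
    [0.10, 0.20, 0.20, 0.50],
])
print(mwe_attention_mass(attn, [1, 2]))  # ≈ 0.5: about half the mass lands on the MWE
```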
Multi-channel Encoder for Neural Machine Translation
The attention-based encoder-decoder is an effective architecture for neural machine translation (NMT); it typically relies on recurrent neural networks (RNNs) to build the representations that the attentive reader later consults during decoding. This encoder design yields a relatively uniform composition of the source sentence, despite the gating mechanism employed in the encoding RNN. On the...
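For readers unfamiliar with the "attentive reader" step mentioned above, here is a minimal NumPy sketch of one step of Bahdanau-style additive attention over encoder annotations. The parameter names and toy dimensions are illustrative assumptions, not the multi-channel design proposed in that paper.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def additive_attention(decoder_state, encoder_states, W_q, W_k, v):
    """One step of Bahdanau-style additive attention.

    decoder_state:  (d_dec,)        current decoder hidden state (the "attentive reader")
    encoder_states: (src_len, d_enc) annotations produced by the encoder RNN
    W_q: (d_att, d_dec), W_k: (d_att, d_enc), v: (d_att,)
    Returns the context vector and the attention weights.
    """
    scores = np.array([
        v @ np.tanh(W_q @ decoder_state + W_k @ h) for h in encoder_states
    ])
    weights = softmax(scores)            # how much each source position matters
    context = weights @ encoder_states   # weighted sum of encoder annotations
    return context, weights

# Toy dimensions and random parameters, just to show the shapes involved.
rng = np.random.default_rng(0)
d_dec, d_enc, d_att, src_len = 4, 6, 5, 3
ctx, w = additive_attention(
    rng.normal(size=d_dec),
    rng.normal(size=(src_len, d_enc)),
    rng.normal(size=(d_att, d_dec)),
    rng.normal(size=(d_att, d_enc)),
    rng.normal(size=d_att),
)
print(w.sum())  # ≈ 1.0: the weights form a distribution over source positions
```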
Nonsingular Green's Functions for Multi-Layer Homogeneous Microstrip Lines
In this article, three new Green's functions are presented for a narrow strip line (not a thin wire) inside or on a homogeneous dielectric, assuming a quasi-TEM dominant mode. In contrast to previously published Green's functions, these have no singularity, so they can easily be used to determine the capacitance matrix of multi-layer and single-layer homogeneous coupled microstrip lines. To obtain ...
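For context only (standard background, not the paper's nonsingular forms): the conventional two-dimensional electrostatic Green's function used in quasi-TEM capacitance calculations is logarithmically singular at the source point, which is exactly the difficulty the nonsingular functions are meant to avoid,
\[
G(\boldsymbol{\rho},\boldsymbol{\rho}') \;=\; -\frac{1}{2\pi\varepsilon}\,\ln\lvert\boldsymbol{\rho}-\boldsymbol{\rho}'\rvert + C ,
\]
and the resulting charge-voltage relation on the conductors, \(Q_i = \sum_j C_{ij} V_j\), defines the capacitance matrix being computed.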
Stack-based Multi-layer Attention for Transition-based Dependency Parsing
Although sequence-to-sequence (seq2seq) networks have achieved significant success in many NLP tasks such as machine translation and text summarization, simply applying this approach to transition-based dependency parsing does not yield performance gains comparable to other state-of-the-art methods, such as stack-LSTM and head selection. In this paper, we propose a stack-based multi-layer attent...
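To make the idea of layering attention over a parser's stack concrete, here is a minimal sketch in which a query vector repeatedly attends over the stack representations, one pass per layer. The bilinear scoring, the residual-style update, and all names are illustrative assumptions rather than the model proposed in that paper.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def multi_layer_stack_attention(query, stack_states, weight_per_layer):
    """Repeatedly attend over the parser stack, refining the query each layer.

    query: (d,)            state vector asking "which transition next?"
    stack_states: (n, d)   representations of the items currently on the stack
    weight_per_layer:      list of (d, d) matrices, one per attention layer
    """
    for W in weight_per_layer:               # one attention pass per layer
        scores = stack_states @ (W @ query)  # bilinear relevance of each stack item
        weights = softmax(scores)
        context = weights @ stack_states     # summary of the stack for this layer
        query = np.tanh(query + context)     # residual-style update of the query
    return query

rng = np.random.default_rng(0)
d, n, n_layers = 8, 4, 3
refined = multi_layer_stack_attention(
    rng.normal(size=d),
    rng.normal(size=(n, d)),
    [rng.normal(size=(d, d)) for _ in range(n_layers)],
)
print(refined.shape)  # (8,): a stack-aware state to score transitions with
```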
A Novel Reordering Model Based on Multi-layer Phrase for Statistical Machine Translation
Phrase reordering is of great importance for statistical machine translation. According to the movement of translated phrases, reordering patterns can be divided into three classes: monotone, BTG (Bracketing Transduction Grammar), and hierarchical. A good strategy is therefore to use different styles of reordering model for different phrases, according to the characteristics of both the reorde...
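The three reordering classes named above can be illustrated with a toy orientation check on the source-side spans of two adjacent target-side phrases; the label mapping below is a simplification assumed here for illustration, not the paper's multi-layer model.

```python
def reordering_class(prev_span, cur_span):
    """Classify how the current source span moves relative to the previous one.

    prev_span, cur_span: (start, end) source-index ranges (end exclusive) of
    two phrases that are adjacent on the target side. Labels loosely mirror
    the three classes in the abstract: monotone, BTG-style swap, hierarchy.
    """
    prev_start, prev_end = prev_span
    cur_start, cur_end = cur_span
    if cur_start == prev_end:      # current phrase follows the previous one directly
        return "monotone"
    if cur_end == prev_start:      # current phrase directly precedes it: a swap
        return "BTG"
    return "hierarchy"             # discontinuous / longer-distance movement

print(reordering_class((0, 2), (2, 5)))  # monotone
print(reordering_class((3, 5), (0, 3)))  # BTG
print(reordering_class((0, 2), (4, 6)))  # hierarchy
```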
Journal
Journal title: Prague Bulletin of Mathematical Linguistics
Year: 2020
ISSN: 1804-0462, 0032-6585
DOI: 10.14712/00326585.005